resolve and document most common erasure coded pool pain points #3194
Conversation
SUCCESS: make check on fc53dc9, output is http://paste.ubuntu.com/9551698/
Documentation part reviewed by Italo Santos <okdokk@gmail.com>
SUCCESS: make check on b47b333, output is http://paste.ubuntu.com/9553221/
SUCCESS: make check on 4c05213, output is http://paste.ubuntu.com/9554907/
rebased and repushed
FAIL: the output of run-make-check.sh on 4df7a46 is http://paste.pound-python.org/show/EwJFIOR3jzfcKYk3ArDW/
SUCCESS: the output of run-make-check.sh on c3edf67 is http://paste.pound-python.org/show/RcxenPns9Pdz28O1g1iB/
running in gitbuilder
http://tracker.ceph.com/issues/10349

Fixes: #10349
Signed-off-by: Loic Dachary <ldachary@redhat.com>
It is common for people to try to map 9 OSDs out of a cluster with 9 OSDs in total. The default number of tries (50) frequently leads to bad mappings in this case. Raising it to 100 makes no significant difference in CPU cost, as tested manually by running crushtool on one million mappings.

http://tracker.ceph.com/issues/10353

Fixes: #10353
Signed-off-by: Loic Dachary <ldachary@redhat.com>
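The effect of the tries limit can be checked with crushtool's mapping test. A minimal sketch, assuming a crush map has been saved to a local file named `crushmap` (the file name and rule number are placeholders for your own cluster's values):

```shell
# Extract the current crush map from the cluster (binary form):
ceph osd getcrushmap -o crushmap

# Try mapping 9 replicas/chunks with the default retry budget,
# then with the raised one; bad mappings are printed if any occur.
crushtool -i crushmap --test --rule 1 --num-rep 9 \
    --set-choose-total-tries 50 --show-bad-mappings
crushtool -i crushmap --test --rule 1 --num-rep 9 \
    --set-choose-total-tries 100 --show-bad-mappings
```

If the first invocation reports bad mappings and the second does not, the cluster is hitting exactly the pain point this commit addresses.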
The ruleset created for an erasure coded pool has max_size set to a fixed value of 20, which is incorrect when more than 20 chunks are needed and leads to obscure errors. Set it instead to the number of chunks, i.e. k+m in most cases.

In a cluster with few OSDs (9, for instance), setting max_size to 20 also causes performance problems when injecting a new crushmap. The monitor calls CrushTester::test, which tries 1024 mappings for every size from min_size to max_size. Each attempt to map more OSDs than are available exhausts all retries (50 by default) and takes a significant amount of time: in a cluster with 9 OSDs, testing one such ruleset can take up to 5 seconds. Since the test blocks the monitor leader, a few erasure coded rulesets are enough to block the monitor past the timeouts and trigger an election.

http://tracker.ceph.com/issues/10363

Fixes: #10363
Signed-off-by: Loic Dachary <ldachary@redhat.com>
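For illustration, this is roughly what the decompiled rule looks like after the fix, for a hypothetical profile with k=6, m=3 (the rule name, ruleset number, and failure domain are example values, not taken from this PR):

```
rule ecpool {
        ruleset 1
        type erasure
        min_size 3
        max_size 9          # previously a fixed 20; now k+m = 6+3
        step set_chooseleaf_tries 5
        step take default
        step chooseleaf indep 0 type host
        step emit
}
```

With max_size bounded by k+m, CrushTester::test no longer wastes time attempting sizes the pool can never use.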
Add a new section to the PG troubleshooting documentation covering the most common problems reported when an erasure coded pool fails to map PGs to enough OSDs.

http://tracker.ceph.com/issues/10350

Fixes: #10350
Signed-off-by: Loic Dachary <ldachary@redhat.com>
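The kind of diagnosis the new documentation section targets can be sketched with standard ceph commands (run against a live cluster; output shown here is omitted because it depends on the cluster state):

```shell
# Identify PGs that an erasure coded pool cannot map to enough OSDs:
ceph health detail                # lists degraded/undersized PGs
ceph pg dump_stuck unclean        # PGs stuck without a complete acting set

# Check whether the ruleset and profile fit the cluster:
ceph osd crush rule dump          # inspect min_size/max_size of the rules
ceph osd erasure-code-profile get default   # verify k+m <= number of OSDs
```

A profile requiring more chunks than there are OSDs (or failure domains) is the most common cause of the incomplete mappings described above.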
Use different erasure coded pool names and profiles to avoid deletion/creation races. The more expensive alternative is to run a different cluster for each test.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
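One way to give each test its own pool and profile is to derive the names from a unique suffix. A minimal sketch, assuming the shell's PID as the suffix and an example k=2, m=1 profile (both are illustrative choices, not the PR's actual test parameters):

```shell
# Unique per-test suffix so back-to-back tests never race on
# the deletion of one pool against the creation of the next.
id=$$
ceph osd erasure-code-profile set profile-$id k=2 m=1
ceph osd pool create ecpool-$id 12 12 erasure profile-$id

# ... run the test against ecpool-$id ...

ceph osd pool delete ecpool-$id ecpool-$id --yes-i-really-really-mean-it
ceph osd erasure-code-profile rm profile-$id
```

Because every test creates and deletes only its own uniquely named resources, no coordination between tests is needed.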
SUCCESS: the output of run-make-check.sh on centos-centos7 for ac051fe is http://paste2.org/dFMbjVBs
…ries resolve and document most common erasure coded pool pain points

Documentation-Reviewed-by: Italo Santos <okdokk@gmail.com>
No description provided.